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Abstract: This paper unifies and extends results on a class of multivariate Extreme 
Value (EV) models studied by Hougaard, Crowder, and Tawn. In these models 
both unconditional and conditional distributions are EV, and all lower-dimensional 
marginals and maxima belong to the class. This leads to substantial economies of 
understanding, analysis and prediction. One interpretation of the models is as size 
mixtures of EV distributions, where the mixing is by positive stable distributions. A 
second interpretation is as exponential-stable location mixtures (for Gumbel) or as 
power-stable scale mixtures (for non-Gumbel EV distributions). A third interpreta- 
tion is through a Peaks over Thresholds model with a positive stable intensity. The 
mixing variables are used as a modeling tool and for better understanding and model 
checking. We study extreme value analogues of components of variance models, and 
new time series, spatial, and continuous parameter models for extreme values. The 
results are applied to data from a pitting corrosion investigation. 

1. Introduction 

Multivariate models for extreme value data are attracting substantial interest, 
see e.g. Kotz and Nadarajah (2000) and Fougeres (2004). However, with the ex- 
ception of Smith (2004) and Heffernan and Tawn (2004), few applications involving 
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more than two or three dimensions have been reported. One main apphcation area 
is environmental extremes. Dependence between extreme wind speeds and rain 
fall can be important for reservoir safety (Anderson and Nadarajah (1993), Led- 
ford and Tawn (1996)), high mean water levels occurring together with extreme 
waves may cause flooding (Bruun and Tawn (1998), de Haan and de Ronde (1998)), 
and simultaneous high water levels at different spatial locations pose risks for large 
floods (Coles and Tawn (1991)). Another set of applications is in economics where 
multivariate extreme value theory has been used to model the risk that extreme 
fluctuations of several exchange rates or of prices of several assets, such as stocks, 
occur together (Mikosch (2004), Smith (2004), Starica (1999)). A third use, perhaps 
somewhat unlikely, is in the theory of rational choice (McFadden (1978)). Below 
we will also consider a fourth problem, analysis of pitting corrosion measurements 
(Kowaka (1994), Scarf and Laycock (1994)). 

The papers cited above all use multivariate Extreme Value (EV) distributions. 
The rationale is the "extreme value argument" : maxima of many individually small 
variables often have approximately a (univariate or multivariate as the case may be) 
extreme value distribution. However in "random effects" situations this argument 
becomes less clear. Suppose e.g. a number of groups each has its own i.i.d variation 
but in addition each group is affected by some overall random effect. Then, is it the 
unconditional distributions which belong to the extreme value family, or is it the 
conditional distribution, given the value of the random effect? In many situations the 
extreme value argument seems equally compelling for unconditional and conditional 
distributions. So, should one use an EV model for the conditional distribution; or 
is it perhaps the unconditional distributions which are extreme value? 

In the present paper this problem is overcome by using models where both con- 
ditional and unconditional distributions are EV. The models have the further at- 
tractive properties that all lower-dimensional marginals belong to the same class of 
models, and that maxima of all kinds, e.g. over a number of "groups" with differing 
numbers of elements, also have distributions which belong to the class. 
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The models are obtained by mixing EV distributions over a positive stable distri- 
bution. They were first noted by Watson and Smith (1985) and, in a survival anal- 
ysis context, apparently independently introduced by Hougaard (1986) and Crow- 
dcr (1989). Further interesting applications of such models were made in Crowder 
(1998). The most general versions of these distributions were called the asymmetric 
logistic distribution and the nested logistic distribution by Tawn (1990) and McFad- 
den (1978) and were further studied in Coles and Tawn (1991). Crowder (1985) and 
Crowder and Kimber (1997) contain some related material. However, we believe 
that the full potential of these models is still far from being realized. In this pa- 
per we have attempted to take three more steps towards making them more widely 
useful. 

The first step is to revisit the papers of Hougaard, Crowder and Tawn, to col- 
lect and solidify the results in these papers. We concentrated on two parts: the 
physical motivation for the models, and a clear mathematical formulation of the 
general results. The second step is to use the stable mixing variables not just as 
a "trick" to obtain multivariate distributions, but as a modeling tool. Insights ob- 
tained from taking the mixing variable seriously are new model checking tools, and 
better understanding of identifiability of parameters and of the model in general. 
The final important step is the realization that through suitable choices of the mix- 
ing variables it is possible to obtain new natural time series models, spatial models, 
and continuous parameter models for extreme value data. This provides classes of 
models for extreme value data which go beyond dimensions two and three. 

It is not immediately obvious from the forms of the asymmetric logistic distri- 
bution and the nested logistic distribution how to simulate values from them, see 
e.g. Kotz and Nadarajah (1999, Section 3.7). However the representation as stable 
mixtures makes simulation straightforward. According to it, one can first simulate 
the stable variables, using e.g. the method of Chambers et al. (1976), and then 
simulate independent variables from the conditional distribution given the stable 
variables, cf. Stephenson (2003). This adds to the usefulness of the models. 
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Our results can be presented in two closely related ways: as mixture models 
for Gumbel distributions, and as mixture models for the general family of EV dis- 
tributions. We first present the results for Gumbel distributions. The Gumbel 
distribution has a special importance. It occurs as the limit of maxima of most 
standard distributions, specifically so for the normal distribution. In fact, it is the 
only possible limit for the entire range of tail behavior between polynomial de- 
crease and (essentially) a finite endpoint. Another reason is the approximate lack of 
memory property of the locally exponential tails, which goes together with it. The 
Gumbel distribution is known to fit well in many situations, e.g. for pit corrosion 
measurements (Kowaka (1994)). 

We present three motivations/interpretations of the Gumbel models. One is as 
an exponential-stable location mixture of independent Gumbel distributions with 
the same scale parameter. A second interpretation is as size mixtures of extreme 
value distributions, where the mixing is by positive stable distributions. A third 
interpretation is through a Peaks over Thresholds (PoT) model with a positive 
stable random intensity. 

We also develop the models in the general EV setting. In it, two out of three 
physical motivations for the model, as size mixtures and as maxima in a Peaks over 
Thresholds model with a doubly stochastic Poisson number of large values are the 
same as for the Gumbel model. The counterpart to the remaining Gumbel interpre- 
tation, as a location parameter mixture, is that the multivariate EV distributions 
are obtained as scale mixtures with an accompanying location change which keeps 
the endpoints of the distributions fixed. 

The basic motivations and explanations of the models for the Gumbel case are 
collected in Section [2] below. In Section[3]we rederive and remotivate the asymmetric 
and nested logistic multivariate Gumbel distributions and introduce new classes 
of multivariate Gumbel models for time series, spatial, and continuous parameter 
applications. In Section H] we discuss estimation in the random effects model and in 
a hidden MA(1) model. These models are then used to analyze a data set coming 
from an investigation of pitting corrosion on the lower hemflange of a car door. The 
section also uses new model checking tools. Properties of the exponential-stable 
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mixing distributions are given in Section [5j Section [6] translates the Gumbel results 
and models from Sections [21 [Sj and H] to the general EV family. Section [7] contains 
a small concluding discussion. 



In this section we revisit the physical motivations/interpretations for the models, 
and add one of our own - as a "size mixture". We present the motivations in a 
new setting which seems particularly illustrative. This situation is a standard type 
of pitting corrosion measurement. In it a number of metal test specimens, e.g. 
from the body of a car, are divided up into subareas, called test areas, and the 
deepest corrosion pit in each of the test areas is measured. The presumption is that 
there may be an extra variation between specimens (due to position) which is not 
present between test areas from the same specimen. In Section 4 we analyze such 
an experiment. One cause of extra variation in this experiment was the randomness 
in the proportion of the surface which was covered by corrosion-preventing coating. 
There undoubtedly were other causes, such as differences in exposure to dirt and 
salt. However, for the present purposes of illustration we mainly talk about the 
variation in the size of the surface cover. 

We introduce the ideas in the one-dimensional case. The motivations, however, 
extend directly to the new multivariate models which are treated in subsequent 
sections and which are the main interest of this paper. 

The mathematical basis is the following observation. Let 5" be a standard positive 
a-stable variable, specified by its Laplace transform 



where necessarily a G (0,1]. (When a = 1, S is taken to be identically 1, see the 
discussion in Section 5.) Further, let the random variable X be Gumbel distributed 
conditionally on S, 



2. Mixtures of Gumbel distributions 



(2.1) 
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Hence unconditionally X also has a Gumbel distribution, but the mixing increases 
the scale parameter a of the conditional Gumbel distribution to a /a. 

We will sometimes use the terminology that the distribution of X is directed by the 
stable variable S. Let G ~ Gumbel(/i, a) mean that the random variable G has the 
distribution function (d.f.) exp(— e ^). If S has the distribution specified by (j2.ip . 
the variable M = ^+a\og{S) will be called exponential-stable with parameters a, 
and (J. The symbols M ~ ExpS(Q, fi, cr) will be used to denote such a distribution. 

We will give equation (j2.3|) three different interpretations. The first one was used 
by Crowder (1989) in the context of a "first order components of variance" setting (cf 
also Hougaard (1986)). The third one was put forth by Tawn (1990), and discussed 
in a wind storm setting. 

(i) Gumbel distribution as a location mixture of Gumbel distributions: If G and 
M are independent and G ~ Gumbel(/ii, cr) and M ^ ExpS(a, /X2, c) then G + AI ~ 
Gumbel(^i + /i2,o"/o)- This follows by replacing fi in (j2.2p and (j2.3p by fii + fi2- 

For the pitting corrosion measurements, the interpretation would be that the 
maximal pit depth in a test area had a Gumbel distribution with a random location 
parameter fii + M. The value of M would depend on the extent to which the test 
area was exposed to corrosion. 

Briefly going beyond the one-dimensional model, it would be natural to assume 
that different test areas would have different G-s but that the variable M would be 
the same for all test areas on the same specimen, and different for different test 
specimens. A further remark is that in this model it is not possible to separate ni 
and /i2. However, the parameters can be made identifiable by assuming that either 
^1 or ^2 is zero. 

(ii) Gumbel distribution as a size mixture of Gumbel distributions: If the maximum 
over a unit block has the Gumbel d.f. exp(— e ~) and blocks are independent 
then the maximum over n blocks, or equivalently over one block of size n, has the 
d.f. 



(2.4) 



(exp(- 
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In this equation it makes sense to think of non-integer block sizes and random block 
sizes. In particular, it makes sense to replace n by Se^'^l" in (12. 4p to obtain the 

again follows from (j2.ip that the unconditional 
distribution is Gumbel(/xi + /X2, cr/a). Thus the Gumbel(/xi + ;U2, cr/a) distribution is 
obtained as a "size mixture" of Gumbel(/ii, cr) distributions, by using the stable size 
distribution Se^'^l" . As before, to make the model identifiable, one should assume 
that either [i\ or 112 is zero. 

The interpretation in the corrosion example is that Se^'^^" is the "size" of the 
part of the test area which is exposed to corrosion. This size of course cannot be 
negative. Further it could reasonably be expected to be determined as the sum of 
many individually negligible contributions. Suitably interpreted, these two proper- 
ties together characterize the positive stable distributions. 

Next, it is well known that maxima of i.i.d. variables asymptotically have a 
Gumbel distribution if the point process of large values asymptotically is a Poisson 
process. More precisely, if are suitably linearly renormalized values of an 

i.i.d. sequence {1^} and U = i/n, then the point process Ylii^{UXu i) t^nds to a 
Poisson process in the plane with intensity dK = dt x d{e~^^~^^^'^) if and only if the 
probability that maxi<j<„y„_j < x tends to exp(— e <^ ), see e.g. Leadbetter et 
al. (1983). Our third interpretation of the Gumbel mixture model is obtained by 
replacing the constant intensity in the point process by a stable one. 

(in) Gumbel distribution as the maximum of a conditionally Poisson point process: 
Suppose X is the maximum y-coordinate of a point process in (0, 1] x R such that 
conditionally on a stable variable S the point process is Poisson with intensity 
dA = Sef'-'/^dt X (i(e"(^-''i)/'^). Then, by the same argument as above, conditionally 
on S", the variable X has d.f. exp(— Se^^/'^e ~), and as for (|2.3p . it follows that 
the unconditional distribution of X is Gumbel(/ii + ;U2,(T/a). 

In the corrosion example, the points in the point process correspond to pit depths 
on the surface of the test area. The random intensity Se^'^l" then would describe 
an extra stochastic variation in intensity of pits from test area to test area. Again 
this has to be positive and perhaps is obtained as the sum of many individually 
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negligible influences, and hence approximately positive stable. As above, one of /xi 
or fi2 should be assumed to be zero for identifiability. 

It may also be noted that in some situations it may be possible to use PoT obser- 
vations, i.e. to actually observe the underlying large values, say all deep corrosion 
pits in each test area. Such measurements could also be handled within the present 
framework, by substituting the likelihoods in this paper with the corresponding 
point process (or PoT) likelihoods. However, we will not pursue this further here. 

By way of comment, the logarithm of the positive stable distribution which occurs 
in the location mixture (i) has finite moments of all orders. In contrast, the posi- 
tive stable variables themselves have infinite means. This, however, seems largely 
irrelevant both for the mathematics of the models and for modelling. 



3. New classes of Gumbel processes 

In this section we introduce a number of concrete Gumbel models directed by 
linear stable processes: a random effects model, time series models with directing 
stable linear processes, and a spatial model with a stable moving average as directing 
process. We also consider a hierarchical setup and continuous parameter models. 

However, to provide a solid foundation for this paper and future developments, 
we first give a precise mathematical formulation of results of Tawn (1990). This 
shows the exact relations between the three interpretations given in Section [2] in 
a general setting, and slightly generalizes (a restriction on the size of the set A is 
removed) Tawn's main result. 

Let T and A be discrete index sets, where in addition T is assumed to be finite. 
Further let {ct^a} be non-negative constants and let {Sa,a € ^4} be independent 
positive Q-stable variables with distribution specified by ()2.ip . We assume without 
further comment that Y2aeA '^t,aSa converges almost surely for each t. 



Proposition 1. Consider the following three models: 

ct,aSa), t G T, where Gt ^Gumbel^ntjCrt), and the Gt-s and 

aeA 

Sa-s all are mutually independent. 

(a) Xt,t E T are conditionally independent random variables given Sa,a S A, with 
marginal distributions 



-(5^Q,„5,)e \ 
aeA J 



t G T. 



(Hi) For t gT, Xf is the maximum y-coordinate of a point process in (0, 1] x i? such 
that conditionally on Sa,a € A the point process is independent and Poisson with 
intensity {EaeACt,aSa) dt x a!(e-("-^*)/'^')- 

Then all three models are the same, i.e. they have the same finite dimensional 
distributions: 

(3.2) PiXt < xt,t e T) = Hew i-{Y,ct,ae~'^r] , 

aeA \ teT J 

and this distribution is a multivariate extreme value distribution. 

Proof. By the form of the Gumbel distribution function, (i) imphes that (ii) holds. 
Similarly, by the same argument as for (iii) of Section 2 above, it follows that (iii) 
of the proposition implies (ii). Further, that (ii) implies ()3.2p follows immediately 
from (|2.1|) since, by independence of the {Sa}, 

teT aeA / 

= llEiexp[-Sa{^ct,ae-^)]\. 

aeA \ teT / 

It is obvious that the distribution (j3.2p is max-stable, and hence an EV distribution. 

□ 

As discussed in the introduction, a class of multivariate extreme value mixture 
models is most useful if (a) both unconditional and conditional distributions are 
extreme value, (b) lower-dimensional marginal distributions also belong to the class, 
and (c) maxima over any subsets have joint distributions which belong to the class. 
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Now, (a) is a part of Proposition 1. Further, if one sets some of the Xt in (j3.2p equal 
to infinity the corresponding terms in the sum in the right hand side vanishes, but 
the expression stiU is of the same general form, and hence the model ()3.2p satisfies 
the requirement (b). 

The model also satisfies (c) if one imposes the extra restriction that all the scale 
parameters have the same value, i.e. that at = cr, for t (z T. For the marginal 
distribution of a maximum this is because if Ti C T then 

(3.3) ^(maxXt < x) = J] exp -( J]] Q,aeV)°e"^ 

or equivalently 

maxXt ~ Gumbel | (ct/q) log(^(^ ct,ae^*/'")"), a/a 

\ aeA teTi 

In particular, by letting Ti be a one point set we see that in this case marginals are 
Gumbel distributed, 

Xt ~ Gumbel [(a/a) log(^(ct,ae'^*/'^)"), a/a J . 

V aeA J 

Moreover, joint distributions of maxima also belong to the class (13. 2p of distribu- 
tions. E.g. let Ti and T2 be disjoint subsets of T and set CT,,a = J2teT, ^t,a sxp (/it/a), 
for i = 1,2. Then, as can be seen from (13. ip or ()3.2I) . 

n/ ^1 ^2 \ 

exp ( -(cTi,ae"~ + CT2,ae~~)°'] , 

"^^^ "^^"^ aeA 

which has the form (j3.2p . Similar but more complicated formulas hold when more 
subsets are involved and when the subsets can overlap. 

These two properties are touched upon by Crowder (1989) in a less general situ- 
ation, and also by Tawn (1990). 

Conditions (i) - (iii) in Proposition 1 correspond to the three "physical" inter- 
pretations in Section 2. We now turn to a number of specific models. Which 
interpretation is most relevant of course varies from model to model. E.g the first 
model below is the standard logistic model for extreme value data, but with the 
interpretation as a random effects model. We will use it on a pit corrosion example, 
where perhaps the interpretation (ii) is most compelling. However, to streamline 
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presentation, we will for the rest of this section formulate the models as in (i), but 
of course could equally well have used (ii) or (iii). 

Example: A one-way random effects model. This is the model 

(3.4) Xjj = // + Tj + Gjj, 1 < i < m, 1 < j < Ui 

with /i a constant, Tj ~ExpS(a, 0, a), Gij ~Gumbel(0, o") and all variables indepen- 
dent. 

Setting T = 1 < i < m, l<j< n-j}, A = {1,2,... ,m} and Ci^ij^^f. = 

l^^^j^j, this is a special case of the situation in Proposition 1 and we directly get the 
distribution function 

m Ui _ 

(3.5) PiXij < Xij, l<i<m, l<j <n,) = J] exp(-(^ e"^)"). 

i=i j=i 

According to Proposition 1 and the subsequent remarks this is a multivariate 
EV distribution, and explicit formulas are directly available for the distribution 
of all kinds of unconditional and conditional maxima. In particular the marginal 
distributions are Gumbel(^, a*) for a* = a/a. □ 

This model can be extended to higher order random effects models which are 
"linear on an exponential scale". It can also be natural, for instance in a "repeated 
measurements" setting, to let be a function of t, perhaps depending on the values 
of known covariates, as done in Crowder (1989, 1998) or Hougaard (1986). Note 
however that in the context of repeated data, say {Yi,--- ,Yp), the set T from 
Proposition 1 has to be T = l<i<p, l<i< Ui}, whereas we allow more 

general T's. 

We next turn to time series models. A linear stationary positive stable process 
may be obtained as Ht = YliZ-oo^i^t-i^ where the Si have distribution (j2.ip . the 
bi are nonnegative constants, and the sum converges in distribution if X] < 
Defining 



(3.6) 



Xt = fit + (7log{Ht) + Gt, 
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for some constants /ij gives a Gumbel time series model. In particular (j3.6p includes 
hidden ARMA models. We next look closer at the two simplest cases of this. 

Example: A hidden MA-process model. Suppose Ht = boSt + biSt-i + • • ■ + bqSt-q 
and Xt is defined by (|3.6p . where the Si have distribution (j2.ip . Gt ~Gumbel(0, o") 
and all variables are mutually independent. Then, by Proposition 1 with T = 
{l,...,n},andA = {0,±l,...}, 

n n/\(k+q) 

(3.7) P{Xt <xul<t<n)= n exp(-( b^.^e"^)"). 

k=l-q t=lVk 

□ 

Example: A hidden AR-process model. For < p < 1 define the positive stable 
AR-process Ht by Ht = J2iZo P^^t-ii and let Xt be given by (|3.6p . with the 5j and 
Gt as before. From the definition of Ht, 

oo 

(3.8) Ho = ^p'S.i 

i=0 

Hi = pHo + Si 

Hn = p''Ho + p''-^Si + ---+pSn-l+Sn, 

and in addition, by (12. ip Hq has the same distribution as 

oo 
1=0 

and is independent of 5i, . . . , 5„. Thus, the model is again of the form considered in 
Proposition 1, with T = {0, . . . , n}, A = {0, ±1, . . . } and q,o = p*(l-/o")"^/", ct,a = 
pt-a £qj. ^ _ _ _ _ ^ ^ g^j^fi = Q otherwise. Thus by Proposition 1 the distribution 
function is 

n n n 

P{Xt <xt,0<t<n)= exp[-(l-p-)-i(j;/e-"^)"]nexp(-(^p*-V-"^)"). 

t=0 1=1 t=i 

□ 
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In the next example we consider models on the integer lattice in the plane. Let 
n(jj) be a system of neighborhoods with the standard properties (i, j) G '^^(ij) and 
{k,l) £ ihj) ^ '^(fc,0- ^ simple example is when the neighbors are the four 

closest points and the point itself, i.e. when n(jj) = (i — 1, j), {i + — 

Example: A spatial hidden MA-process model. Let {Si j;—oo < i,j < 00} be 
independent standard positive a-stable variables and set H^ j = Yl(kl)en(^- ■^^'^k,l 
where 6 is a positive constant. Put 

where the Gij are mutually independent and independent of the Sij, and Gij ~ 
Gumbel(0,(T). Again this is of the form considered in Proposition 1, now with 
(^(i,j),{k,i) = 5 ii {i,j) G ^(A;,^ and zero otherwise. To write down the joint distribution 
function it is convenient to use the notation n(^k,l) = '^(fc.O {(hj)] 1 ^ i^j ^ 
We then get that 

P{Xi,, < Xi,,; 1 < i,j < n) = n exp(-5"( J] e"^"^)"). 

(fc,0 («j)Gn(fe,i) 

□ 

We now turn to a situation not covered by Proposition 1, the so-called nested 
logistic model of McFadden (see Tawn (1990)). 

Example: A two-layer hierarchical model. Consider the model 

Xi,j,k = lJi + Ti + rjij + Gij^k, 1 <i<m, 1 < j < n,, 1 < A; < r^j, 

with /i a constant, Tj ~ExpS(/3, 0, a/a)^/", r]ij ~ExpS(/3, 0, a), Gij^k ~Gumbel(0, a), 
and all variables independent. By repeated conditioning we obtain, after some cal- 
culations similar to the proof of Proposition 1, 

P{Xi,j,k < Xi^i,k, I <i<m, l<j<ni, l<k < rij) 

m Tli fi,] _ 

i=i j=i k=i 

□ 
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There also are continuous parameter versions of Proposition 1. Let {5j(s);s € 
R^} be independently scattered positive stable noise (see Samorodnitsky and Taqqu 
(1994, Chapter 3)). We assume that the noise is standardized, so that for nonnega- 
tive functions f & L^, 

/oo roo 
/(s)5,((is)}] =exp(- / /(s)-(is). 
-oo J — oo 

In the sequel we will without comment assume that functions /(•) are such that 
integrals converge, and integrals are taken to be over R'^. 

Proposition 2. Suppose that there are nonnegative functions /j(t,s) with t G R^, 
s £ R'' such that 

m . 

Xt = Gt + at\og{Y^ / fj{t,s)Sj{ds)), t = ti,...,t„, 

where Gt ^Gumbel{fi,at), and all variables are mutually independent. Then 
(3.10) P(Xt, <xt,; i = l,...,n) = J]exp(- /(J^/,(t„s)e ^)"ds). 

j=l i=l 

The proof follows from (j3.9|) in the same way as Proposition 1 follows from (j2.ip . 
The interpretations (ii), as size mixtures, and (iii) as a random Poisson intensity 
could equally well have been used as assumptions. However, this we leave to the 
reader. 

Proposition 2 gives a natural model for environmental extremes, such as yearly 
maximum wind speeds or water levels, at irregularly located measuring stations. 
E.g. one could assume years to be independent and obtain a simple isotropic model 
for one year by choosing k = £ = 2, m = 1 and /i(t, s) = exp(— (i|t — s|^), for some 
constants d, P > 0. One extension to non-isotropic situations is by letting D he a 
diagonal matrix with positive diagonal elements and taking /i(t,s) = exp(— (t — 
s)*L'(t — s)P). (Formally the entire distribution function for n years is also of the 
form (|3.10p . as can be seen by taking i = 3, m = n and letting the different Sj 
correspond to different years.) It is possible to derive recursion formulas for the 
densities of these models in a similar but more complicated way as for the random 
effects model. If the number of measuring stations is not too large, these expressions 
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may be numerically tractable. However, we will not investigate this further in this 
paper. 

4. Data analysis 

In this section we illustrate the random effects model and the hidden MA(1) 
model from Section [3] by using them to analyze a set of pit corrosion measurements. 
As preliminaries we first discuss maximum likelihood estimation in the two models. 

4.1. Estimation in the random effects model. Let < cr < a*, — oo < /i* < oo, 

so a := a/a* € (0, 1). Assume a data set X that comes from m groups, 

group 1 : Xi^i,Xi^2,- ■ ■ ,Xi^ni 
(4.1) group 2: ^2,1, ^2,2, • • ■ , ^2,n2 

group m : XmA , ^m,2 , • • • , Xm,nm ■ 

The groups are assumed to be independent and the i^^ group comes from a Gumbel(0, a) 
distribution, where the location parameter ^Uj for group i is drawn from an ExpS(a = 
a/a*, fj,*,a) distribution. The goal is to estimate the three parameters 6 = {a, a* , fi*) 
from the data by maximum likelihood. 

The likelihood L(^|X) = J^^-^ Lj(0|Xj^i, . . . , Xj^„.) is the product of the group 
likelihoods. Each of these terms can be derived by differentiating (j3.5p with respect 
to xi,...,Xn- The direct calculations are complicated, but Property (1) of Shi 
(1995) gives recursions for the likelihood function for the group in terms of certain 
coefficients {qnj}- 

The maximum likelihood algorithm has been implemented in S-Plus/R. The esti- 
mation procedure numerically evaluates £(0|X) = logL(0|X) and numerically max- 
imizes it to find the estimate of 6. The search is initialized at '■= (^o/^, coj A'o)) 
where /xq and ctq are estimates of the Gumbel parameters for the (ungrouped) data 
set X. This estimate is found by using the probability-weighted moment estimator, 
see e.g. Section 1.7.6 of Kotz and Nadarajah (2000). 

Usually this makes it straightforward to find maximum likelihood estimates by 
numerical optimization. However, if a group is large or a is small, the coefficients 
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in the recursion can be very large. E.g. the constant term in Shi's notation is 

-iV 



Infi 



n^-A n^-i] "-^-i 



a J \ a J \ a 
In some cases this can cause numerical overflow in the optimization routines. Fur- 
ther, if all groups only have one value or if there is only one group then parameters 
are not identifiable. Presumably parameter estimates will be unreliable if data is 
close to these situations. This however was not the case for the corrosion data in 
Section 4.3 below. Besides, we made rather many simulations (not included in the 
paper) from both random effects Gumbel model and independent Gumbel model 
with arbitrary means, and checked on these simulations that the maximum likeli- 
hood estimators perform reasonably well, as soon as there are a few groups, and 
even when some of the groups are rather small. 

In passing we note an alternative way to derive the likelihood, which in addition 
indicates a possibility to compute it by simulation. A group likelihood, conditional 



on r, IS 



n 

i=i 



1 Xj-ll-T 



a 



exp < — e 



En 



exp < 



where r = cr log 5" and S is a standard a— stable variable, as previously. Hence, a 
group likelihood is 



a" 



5" exp < 



Let A = Yll=i ^ ^)/'^. Then, the expectation in the last expression reduces to 



E [S 



E 



jn "I jn 

in L J ^ ' J/\n L J ' 



dA" J dA'' 

where the second equality makes one more use of the stable distribution of S. 

4.2. Estimation in the hidden MA(1) model. By (|37ll the hidden MA(1) 
model with constant location parameter, /it = fJ- and, for identifiability, bo = l,bi = b 
has distribution function 

(4.2) F = P{Xt<xt, l<t<n)=exp|^-|(6zi)" + |^(zt + 6zt+i)" + z;j|^ , 



17 

where Zt = exp(— (x^ — The parameters of the model are 6 = {fi,b,a,a). By 

difFerentiation with respect to xi,. . . ,Xn the hkehhood function can be seen to be 
of the form 

n 

with F from (j4.2p and Qn defined recursively as follows. Set ui = bzi, ut = zt-i + bzt 
for t = 2, . . . , n, n„_|_i = Zn. Then F = exp(— Yl]^=i ^t) 

Qo = 1, Qi = a + 4"^) , 

Qi = -Qi-2a(a-l)6<"^ + Qi-i«(K"^ + <+/)' i = 2,...,n. 

When 6 = 0, the Qi term above should be interpreted as Qi = au2~^ , which makes 
the likelihood formula valid in the case where the xt are independent. 

Maximum likelihood estimation of the parameters (/u, b, a, a) has been imple- 
mented in S-Plus/R, where 

n+l / \ 

log{L(0|X)} = log Qn - E< - E ( -nloga 

t=i t=i ^ ^ ^ 

is computed and numerically maximized. As default the search is started at (/u = 

fj,o,b = 0, o" = o"o/0.5,a = 0.5), where {fiQ^ao) are the Gumbel probability-weighted 

moment estimators for the data set. In ad hoc simulations to test this method, we 

sometimes observed that results were sensitive to the choice of starting values when 

the sample size was small. Apparently the likelihood surface has local maxima in 

such cases. To deal with this problem, we started the search at several different 

randomly chosen points and chose as estimator the final values which gave the 

highest likelihood. 

4.3. Pitting corrosion data analysis. The pitting corrosion investigation which 
generated this data set was briefly mentioned in the beginning of Section [2l Specif- 
ically, pieces (or "test specimens" ) were cut out from different parts of the bottom 
hemflange of the aluminum back door of a twelve year old station wagon. The 
corrosion products were dissolved from the pieces, and the deepest corrosion pit 
was measured in a number of one centimetre long test areas on each specimen. 
The hemflange had been glued together and had also been treated with a corrosion 
preventing coating. Surface areas where the glue or coating was intact showed no 
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corrosion. However, in some places the glue and coating had not penetrated well or 
had fallen of, leaving the surface exposed to corrosion. The proportion of the area 
which could corrode varied between specimens, and this was a potential cause of 
extra variation in the corrosion measurements. These areas, however, had not been 
measured (and it would have been difficult to do so) and there were other causes of 
extra variation, such as varying exposure to salt. 

Interest was centered on the risk of penetration by the deepest corrosion pit 
on the outer surface of the hemflange. The data set for this surface consisted of 
microscope measurements (in microns) of the maximum pit depth in 11 to 15 test 
areas on each of 12 specimens. There was no corrosion on 5 of the test specimens, 
and on one specimen only two test areas showed any corrosion. These 6 specimens 
were excluded from our analysis. Also in the remaining specimens there were some 
corrosion free test areas, and the data we used for analysis hence consisted of 6 
groups (=test specimens) with varying numbers (ranging from 4 to 14) of measured 
maximum pit depths. 

The engineers who performed the experiment disregarded the group structure and 
considered the pooled data set as an i.i.d Gumbel sample. The maximum likelihood 
parameter estimates under this model were (/Upooh Cpooi) = (145.6,69.4). It was 
remarked by the engineers that there seemed to be some deviation from a straight 
line in Gumbel plot, see Figure [7T1 While the overall fit to the pooled data seems 
reasonable, there is clear group structure. 

Include Figure 7.1 here 

We instead modeled and analyzed the data as dependent 4 to 14-dimensional 
random vectors. We first made use of the random effects Gumbel model from 
Subsection 14.11 The aim was both to see if this model fitted better and to check 
wether it lead to a substantially different risk estimate. In addition to the extra 
variation between test specimens there might also be a short range dependence 
between neighboring test areas. We tried to judge the size of short range dependence 
by fitting a hidden MA(1) model on top of the random effects model. 

The maximum likelihood estimates in the random effects Gumbel model were 
{fj,, a, a) = (140.9, 54.1, 0.716) with standard deviations (21.75, 5.71, 0.118) estimated 
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from the inverse of the empirical information matrix. A very rough calculation of the 
risk of perforation can then be made as follows. There are about 15 test specimens 
on a hemflange. Let us assume, as was the case with the present data, that typically 
about 6 of the test specimens will show corrosion and that on average about 11 test 
areas on each specimen will be corroded. Then, by (j3.5p the estimated distribution 
function of the maximum pit depth for one car would be 

F(x) = exp(-6(lle-^^)0-™). 

The thickness of the aluminum was 1.1 mm = 1100 microns and hence we estimate 
that there on the average will be perforation in one out of 1/(1 — -F(llOO)) = 9671 
cars. A delta method 95% confidence interval for this estimate is (8392, 10950). 
If we instead, following the engineering analysis, use the pooled Gumbel model 
with the assumption that typically there are 6 x 11 = 66 corroded test areas on a 
hemflange, the risk estimate is that on the average there is penetration in one out of 
(1 — exp(— 66e 69^4 14374 cars. A delta method 95% confidence interval is 

(13115, 15632). Thus, the random effects model gave a practically and statistically 
significantly different answer than the pooled analysis. 

The formulation as a random effects model gives a number of possibilities for 
model checking. From Figure 17.21 can be seen that the Gumbel distribution fits 
reasonably well to the separate groups, that there indeed seems to be an extra 
variation between groups, and that the fitted lines are approximately parallel. As 
a formal check on this, we made a conditional analysis, fitting separate Gumbel 
distributions to the groups by maximum likelihood. In this we considered three 
different models, the first with separate ^i-s and as for the groups, the second with 
all groups assumed to have the same a but different ^'s for the different groups, and 
a third model with the same a and the same for all observations. A likelihood 
ratio test between the first two models gave p = .53, and hence it seemed reasonable 
to assume the same a in all groups, as is done in the random effects model. A test 
of the second model against third lead to p = 2 • 10^^. Thus the pooled model is 
rejected, while this analysis did not contradict the validity of the random effects 
model. 
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As further checks on the random effects model, the a estimate from the second 
model in the previous paragraph was 47.6, which is reasonably close to the a estimate 
54.1 in the random effects model. Similarly, a* = 75.6 and cxpooi = 69.4 are rather 
close, as they should be. A further comparison is that the correlation coefficient 
estimated nonparametrically from the data was 0.44. This can be compared with 
the correlation coefficient 1 — = 0.49 computed from the fitted model. 

Include Figure 7.2 here 

Figure [73] shows the quantiles of the estimated /i-s against the quantiles of the fitted 
exponential-stable distribution. According to the model, the /i-s are exponential- 
stable, and hence, apart from estimation error, the estimated fi-s are expected to be 
exponential-stable, so this plot is a diagnostic for the fit of the mixing distribution. 
The plot also shows a reasonable fit, and in fact looks much like the same qq-plots 
from simulated values from the model. Thus, neither of these model checks indicated 
problems with the random effects model. 

Include Figure 7.3 here 

As a final analysis we fitted the hidden MA(1) model from Subsection 14.21 to the 
data, since there was a possibility of extra dependence between neighboring test 
areas. In this we assumed groups were independent and had their own /i-s, but that 
a, a and b were the same in all 6 groups. Thus there were in all 9 parameters, the six 
group means /Ui, /i2, /"s, /^4, A's, /"e and the parameters a,a,b. Maximum likelihood 
estimation using the default initial values got stuck in a local maximum, and we 
hence did the optimization for 100 different starting values for a, a, b, chosen at 
random from the cube [7,54] x [0.1,0.99] x [0,2]. As estimates we took the final 
values which gave the highest likelihood. For the fi-s in the 6 groups these were 
87.3, 142.0, 132.4, 140.0, 67.6, 214.8 and the estimators for the remaining parameters 
were a = 29.6, a = 0.58, b = 0.13. 

From the model, the marginal distributions in the groups are Gumbel with loca- 
tion parameter A* + ^ log(l -|- b") and scale parameter a/a. The estimates of these 
agreed to within 5% with their initial values, which indicated that these parameters 
were reasonably well determined by the data. The remaining two parameters, a 
and b, model the dependence structure. The smaller the a and the closer b is to 
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one, the higher is the dependence. These parameters seemed harder to estimate. 
However, their estimated values indicated a rather weak local dependence, and did 
not contradict the validity of the random effects model. 

We accordingly stopped the analysis at this point. If the dependence had been 
judged important, we could have tried to fit a model which included both random 
group means and a local MA(1) dependence. Further model checking, as suggested 
by Crowder (1989, Section 3.3), could be performed by using the probability integral 
transform marginally to get uniform (but dependent) residuals or by computing 
Rosenblatt residuals which are approximately independent if the model is correct. 

In summary: The pooled analysis did not fit the data and lead to significantly 
different results than the random effects model. Instead the random effects model 
seemed to give a good representation of the data - in particular none of the several 
diagnostic checks indicated serious departures from it - and we believe it led to 
credible estimates. By way of further comment, it may be noted that we obtained a 
successful fit of the hidden MA(1) model, and that it produced useful information. 

A weak point in the analysis is the assumption that a hemflange has 6 test speci- 
mens with 11 corroded test areas each. Further the variation in pit depths from car 
to car is not included in the data. If measurements on several cars had been avail- 
able, it would have been natural to try to fit the hierarchical model from Section 

El 

5. Some properties of the mixing distribution 

This section discusses some of the basic facts about the models. In the notation 
of Samorodnitsky and Taqqu (1994), the r.v. S in dUD is 5„((cos7ra/2)V",l,0); 
in the notation of Zolotarev (1986), S ~ Sc{ce, 1, 1). It has characteristic function 

i?exp(it5) = exp {— cos(7ra/2)|t|" [1 — i tan(7ra/2)(sign t)]} . 

Let Fs{s) be the d.f. and fsis) be the density of S. If M --^ExpS (a, cr), 
then the d.f. and density of M are Fm{x) = F5[exp{(x — ^)/o"}] and fuix) = 
exp{(x — fi)/a}fs[e^]i{{x — fj.)/a}]/a. Using the programs for computing with sta- 
ble distributions described in Nolan (1997), it is possible to compute densities, d.f., 
quantiles and simulate values for M. Figure [731 shows the density of some log-stable 
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distributions. The densities all have support (—00,00) and appear to be unimodal. 
Note that as q | 1, S" converges in distribution to 1 and hence M = log S converges 
in distribution to 0. 



It is well-known that the upper tail of S is asymptotically Pareto: as 2; — > 00, 
P{S > x) ^ Co,x~°' where = r(a) sin(7ra)/7r. This implies that the right tail of 
M '^ExpS(a, ;U, a) is asymptotically exponential: as t ^ 00, 



The left tail of S is light, see e.g. Section 2.5 of Zolotarev (1986), so the left tail of 
M is even lighter. Thus all moments of M exist; in particular, using the results of 
Section 3.6 of Zolotarev (1986), 



where ■jEuier ~ 0.57721 is Euler's constant. 

As a simple consequence we derive the correlation between two variables in the 
same group in the random effects model (|3.4p . Suppose Xi = fj.+T+Gi, i = 1,2 with 
r ~ExpS(a, 0, cj), Gi ~Gumbel(0, a) and the three variables independent. Then 
Cov(Xi,X2) = Var(r) and Var(Xi) = Var(T) + Var(Gi). Since Var(G,) = ^ we 
obtain that Cor(Xi, X2) = 1 — a^, which varies from in the independent case a = 1 
to 1 as a ^ 0, which is reasonable since the limit corresponds to full dependence. 

6. Mixtures of generalized extreme value distributions 

The mixture models for the Gumbel distribution discussed so far in the paper 
carry over to the (generalized) EV distribution in a straightforward manner. How- 
ever, the interpretation (i) is different. 

The EV distribution has d.f. exp(— (1 + with parameters /i,7 G -R 

and fj > 0. For positive 7 this distribution has a finite left endpoint 5 = fi — a/j 
and for 7 negative it has a finite right endpoint 6 = fj, + a/\^\. In analogy with (|2.ip 
- (12. 3|) let S be positive stable with Laplace transform (|2.1|) and assume that 



Include Figure 7.4 here 





(6.1) 
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Then by (HI]) 



{ 



X — 11 



} 



l/(7/a) 



(6.2) 



P{X <x) = exp 



1 + (7/a) 



{a/a) 



Thus, in the terminology of (ii) of Section O if X is a positive stable size mixture 
of an EV distribution with location ^u, scale a and shape parameter 7 then also X 
itself has an EV distribution with the same location ^ and the same right endpoint 
(5, but with a new scale parameter ct/q and new shape parameter 7/a. Hence in 
particular the unconditional distribution of X has heavier tails than the conditional 
one. 

The physical motivations (ii) and (iii) from Section [2] carry over to the present 
situation without change. Further, from (j6.ip it can be seen that X may be obtained 
as a special random location-scale transformation of an EV distribution. Specifically, 
if E has an EV distribution with parameters cr, 7 and S is positive a-stable and 
independent of E, then X may be represented as 



Thus X is obtained as a scale mixture with mixing distribution S"*", but in addition 
there is an accompanying location change which is tailored to keep the endpoint of 
the distribution unchanged. This, of course, may be the most natural way to make 
scale mixtures of distributions with finite endpoints. 

With this change, the motivations from Section [2] and the models from Section 
[3] carry over to the EV distribution. If the models in Section [3] are written as size 
mixtures, i.e. in the form (ii), the only changes needed to go from Gumbel to EV 
are to replace e ^ by (1 + ^^^^)~^^'^ in all expressions. The recursions for the 
likelihood functions from Section 5 translate to the EV case similarly. 

It is also straightforward to translate specifications using (i) to the EV case. E.g, 
in the formulation (i) the random effects model (|3.4p becomes 



where -Ej.j has an EV distribution with parameters ^, a, 7 and Si positive a-stable, 
and all variables are mutually independent. In the same way, the hidden time series 



(6.3) 



X = S''E + (1 - S^)5. 



Xi^j = S]Ei^, + (1 - S])5., 
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model (|3.6p in EV form can be written as 

Xt = H^Et + (1 - H^)5, 

with Hf a linear stable process and Ef is EV distributed, and all variables are 
mutually independent. 
Next, 

log{X -6) = ^logS + \og{E-S), 

and if X is of the form (j6.3p with 7 > then log(£' — 6) has a Gumbel distribution 
with location parameter \og{a/fi) and scale parameter 7. For 7 < we instead write 

log{S - X) = ^logS + log{S - E), 

where log(5 — E) has a Gumbel distribution with location parameter log(cr/^) and 
scale parameter 7. Thus the diagnostic plots for Gumbel mixtures could be used 
also for EV mixtures, except that 6 isn't known. A pragmatic way to control the 
model assumptions then is to replace 6 by some suitable estimate. 

7. DISCUSSION 

The pitting corrosion example discussed in Section 4 was the starting point for 
the present research. There it seemed important to use models where marginal, 
conditional and unconditional distributions, and maxima over blocks of varying 
sizes all had Gumbel distributions, since this leads to simple and understandable 
results, and credible extrapolation into extreme tails. 

However it seems important to stay within the extreme value framework through- 
out for many other applications too. This is a main reason for the present work. 
Another is that our results open up a wide spectrum of hitherto unavailable pos- 
sibilities to construct extreme value models for complex observation structures, in 
particular for time series and spatial extreme value data. 

The results also throw new light on some much studied logistic models. In par- 
ticular they point to possibilities for new kinds of model diagnostics. In addition 
they show how one can carry over many of the analyses available for normal models 
to an extreme value framework in a simple and intuitive way. One example of how 
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this can be done is the suggested next step in the analysis of the corrosion data, to 
fit a model which includes both random group means and a MA(1) dependence. 

We believe that many applications of these ideas remain to be explored. One aim 
of this paper is to provide a solid basis for such future research. 
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Figure 7.1. Gumbel plot for the pooled corrosion measurements. 
A different symbol is used for each group. 




Figure 7.2. Gumbel plots made separately for the 6 groups. The 
solid lines are the different theoretical Gumbel fits for each group. 
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Figure 7.3. qq-plot of fitted exponential-stable distribution against 
estimated /x-s from the conditional analysis with the same a in all 
groups. 
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Figure 7.4. Plot of densities of standardized exponential-stable dis- 
tributions ExpS(a,0, 1), with varying a. 



